Backend of Programming Languages
1 Turing Completeness
1.1 TM as the limit of computation
Theorem (Church Turing Thesis)
Every algorithm has a TM that implements it.
Definition: Turing Completeness
A programming language is Turing-complete if it is equality expressive as Turing machines.
1.2 Elements of a Turing-complete language
- Data representation
- Array-like memory storage
- Loop
- Branch control
Java is a Turing complete language.
int[] numbers = new int[]{1, 2, 3, 4, 5};
while(...) {
...
}
if(...) {
...
} else {
...
}
1.3 Other Turing-complete languages
1.3.1 Python
= [1,2,3,4]
numbers while(...):
...
if ...:
...else:
...
1.3.2 Clojure
let [numbers [1 2 3 4]]
(
...)
loop ... (recur ...))
(if ...) (
1.3.3 SQL
SQL is also Turing-complete using recursive queries.
2 Minimal And Efficient Backend Language
2.1 Why not Turing Machine
- TM is too inefficient.
- TM lacks strong types.
- TM does not support modern processor’s optimization features.
2.2 Java Bytecode
A simple instruction based runtime environment consisting of:
- An operand stack (of 32-bit or 64-bit cells)
- Local variable array
- A heap for dynamic memory allocation
- Constant pool for defining constants
Operand stack with three values
| <x> |
| <y> |
| <z> |
+-----+
Each x
, y
or z
are either:
- integer, float, hot-spot reference (32b)
- long, double, references (64b)
| <x> |
| <y> | ---> | <w> |
| <z> | iadd | <z> |
+-----+ +-----+
iadd: add the two integers from the top of the stack, and place the result back onto the stack.
2.3 From Java to Bytecode
A minimal Java program.
public class Hello {
public static void main(String[] args) {
System.out.println("Hello World");
}
}
In JVM bytecode, this is converted into much simpler (but more verbose) list of instructions.
.class public Hello
.super java/lang/Object
Declaration of a Java class named Hello
.
.method public <init>()V
aload_0/lang/Object/<init>()V
invokenonvirtual javareturn
.end method
Declares a constructor which invokes the superclass constructor.
.method public static main([Ljava/lang/String;)V
.limit stack 2
/lang/System/out Ljava/io/PrintStream;
getstatic java"Hello world"
ldc /io/PrintStream/println(Ljava/lang/String;)V
invokevirtual javareturn
.end method
The public static void main(String[] args)
method that performs stack based computation.
- allocate a stack of maximum depth 2
- places an object reference on the stack (
System.out
). It also specifies its Java typejava.io.PrintStream
. - places a constant string on top of the stack.
- invokes a method
println(...)
which consumes two from the stack: the string and thePrintStream
.
2.4 Some examples of Bytecode programming
2.4.1 Adding numbers
For example, we omit the class declaration and constructor implmentation in bytecode. Below is the main
method implementation.
.method public static main([Ljava/lang/String;)V
.limit stack 100
.limit locals 2
200
ldc 300
ldc
iadd
istore_1
/lang/System/out Ljava/io/PrintStream;
getstatic java
iload_1/io/PrintStream/println(I)V
invokevirtual javareturn
.end method
Equivalent to the code:
= 200 + 300
x print(x)
2.4.2 Adding numbers using a loop
For this example, we are only going to show the body of the main
method. The complete bytecode would require the class declaration, constructor implementation, and the main
method declaration code. (See above).
.limit stack 100
.limit locals 2
We will have a larger stack in case we need more space for computation. We will have two local variables.
- The
this
variable that refers to the runtime object – this is always needed. - The accumulator integer.
; #1 is the total sum
iconst_0 istore_1
This initializes the local variable using the stack.
- Place
0
on top of the stack - Move the top of the stack to variable
#1
; load a counter value
1000 ldc
Store 1000
to the top of the stack. This will be the count-down value used during the loop.
Important: We will keep the count-down value in the stack until the end of the loop.
: LOOP
This is a label that marks the start of the loop.
; break the loop if reach zero
dup ifeq END_LOOP
Test the top of the stack to be zero. If it is zero, then break out of the loop by branching to END_LOOP
label.
Important: we duplicate the top of the stack so that that test ifeq
does not remove the original count-down value.
; add the counter value to sum
dup
iload_1
iadd istore_1
Add the count-down value (a copy) to the local variable #1
, and then store the result back to the local variable #1
.
; decrement counter
iconst_1 isub
Modify the count-down value on the stack by decrementing it by 1.
goto LOOP
: END_LOOP
End of the loop
; print the total
iload_1/lang/System/out Ljava/io/PrintStream;
getstatic java
iload_1/io/PrintStream/println(I)V
invokevirtual javareturn
After the loop, we print the value of the local variable #1
.
2.5 Towards a better programming interface
We need a programming language for:
- Semantics: Better abstraction of a runtime environment.
- Syntax: Better symbolic description of the computing process.
- Verification: Safer communication to minimize human errors.
- Compilation: Generate code into Java Bytecode for fast execution.